The complete data set used to conduct this analysis is the “GAA” data set, comprising data on 1,994 National Football League (NFL) games played from 2001 to 2017. The NFL is an annual Gaelic football competition contested by the senior county teams. The teams are divided into four divisions, with team ability decreasing as the division number increases. The teams initially play each other once, receiving league points for each win. Towards the end of the competition, the top-ranked teams play each other in the knockout stages to determine the overall winner.
Each game was initially described in the data using 13 variables. These variables describe the teams playing (Team_Name and Opp_Name), the date of the game (Date), the round of the competition (Game_Round), the outcome for the team named in the Team_Name variable (Team_Outcome), the goals, points and total score of each team (Team_Goals, Team_Points, Team_Score, Opp_Goals, Opp_Points and Opp_Score), the venue of the match (Venue) and the division in which the match took place (Division).
Our question of interest is the impact of Croke Park on the performance of the Dublin Gaelic football team. This is a relevant topic in contemporary Irish sport because of the dominance of the Dublin team in recent years. There is much debate in the GAA regarding whether Dublin has an advantage when playing in Croke Park, which is situated in Dublin and is the venue for most of the important games in a given season. Dublin therefore play their most important games in their home county, whereas every other county team has to travel to Croke Park. Croke Park is the national stadium of Ireland.
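# Packages used below. write.xlsx() with sheetName/append is assumed to come from the xlsx package.
library(dplyr)   # filter(), mutate(), case_when() and the %>% pipe
library(xlsx)    # write.xlsx()
library(readxl)  # read_excel()
load("gaa.Rdata")
# These are the teams that have played the highest number of matches in Croke Park.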
# Keep all Division 1 matches involving Dublin, then create a Winner column from the scores.
# It records whether Dublin won, the other team won, or the match was a tie.
# Team_Score belongs to the team in Team_Name, so Dublin wins as Team_Name when Team_Score is higher.
dub_gaa <- gaa %>%
  filter(Opp_Name == "DUBLIN" | Team_Name == "DUBLIN") %>%
  filter(Division == "1") %>%
  mutate(Winner = case_when(
    Team_Name == "DUBLIN" & Team_Score > Opp_Score ~ "DUBLIN",
    Team_Name == "DUBLIN" & Team_Score < Opp_Score ~ "OTHER TEAM",
    Opp_Name == "DUBLIN" & Team_Score < Opp_Score ~ "DUBLIN",
    Opp_Name == "DUBLIN" & Team_Score > Opp_Score ~ "OTHER TEAM",
    Team_Score == Opp_Score ~ "TIE"))
# Keep all Division 1 matches involving Cork, then create a Winner column from the scores.
# It records whether Cork won, the other team won, or the match was a tie.
# Team_Score belongs to the team in Team_Name, so Cork wins as Team_Name when Team_Score is higher.
cork_gaa <- gaa %>%
  filter(Opp_Name == "CORK" | Team_Name == "CORK") %>%
  filter(Division == "1") %>%
  mutate(Winner = case_when(
    Team_Name == "CORK" & Team_Score > Opp_Score ~ "CORK",
    Team_Name == "CORK" & Team_Score < Opp_Score ~ "OTHER TEAM",
    Opp_Name == "CORK" & Team_Score < Opp_Score ~ "CORK",
    Opp_Name == "CORK" & Team_Score > Opp_Score ~ "OTHER TEAM",
    Team_Score == Opp_Score ~ "TIE"))
# Keep all Division 1 matches involving Mayo, then create a Winner column from the scores.
# It records whether Mayo won, the other team won, or the match was a tie.
# Team_Score belongs to the team in Team_Name, so Mayo wins as Team_Name when Team_Score is higher.
mayo_gaa <- gaa %>%
  filter(Opp_Name == "MAYO" | Team_Name == "MAYO") %>%
  filter(Division == "1") %>%
  mutate(Winner = case_when(
    Team_Name == "MAYO" & Team_Score > Opp_Score ~ "MAYO",
    Team_Name == "MAYO" & Team_Score < Opp_Score ~ "OTHER TEAM",
    Opp_Name == "MAYO" & Team_Score < Opp_Score ~ "MAYO",
    Opp_Name == "MAYO" & Team_Score > Opp_Score ~ "OTHER TEAM",
    Team_Score == Opp_Score ~ "TIE"))
# Keep all Division 1 matches involving Kerry, then create a Winner column from the scores.
# It records whether Kerry won, the other team won, or the match was a tie.
# Team_Score belongs to the team in Team_Name, so Kerry wins as Team_Name when Team_Score is higher.
kerry_gaa <- gaa %>%
  filter(Opp_Name == "KERRY" | Team_Name == "KERRY") %>%
  filter(Division == "1") %>%
  mutate(Winner = case_when(
    Team_Name == "KERRY" & Team_Score > Opp_Score ~ "KERRY",
    Team_Name == "KERRY" & Team_Score < Opp_Score ~ "OTHER TEAM",
    Opp_Name == "KERRY" & Team_Score < Opp_Score ~ "KERRY",
    Opp_Name == "KERRY" & Team_Score > Opp_Score ~ "OTHER TEAM",
    Team_Score == Opp_Score ~ "TIE"))
# Several matches have an empty Venue value, so I export these data sets as Excel files to fill them in.
write.xlsx(dub_gaa, file = "dub_gaa.xlsx",
sheetName = "GAA", append = FALSE)
write.xlsx(kerry_gaa, file = "kerry_gaa.xlsx",
sheetName = "GAA", append = FALSE)
write.xlsx(cork_gaa, file = "cork_gaa.xlsx",
sheetName = "GAA", append = FALSE)
write.xlsx(mayo_gaa, file = "mayo_gaa.xlsx",
sheetName = "GAA", append = FALSE)
# Adding in the missing values
# In Excel I added the missing information to each data set and saved the result as an Excel file, which is then read back into R.
load("gaa.Rdata")
cork_gaa <- read_excel("cork_gaa.xlsx")
dub_gaa <- read_excel("dub_gaa.xlsx")
kerry_gaa <- read_excel("kerry_gaa.xlsx")
mayo_gaa <- read_excel("mayo_gaa.xlsx")
# Filter the datasets by Croke Park
dublin <- dub_gaa %>% filter(Venue == "CROKE PARK")
cork <- cork_gaa %>% filter(Venue == "CROKE PARK")
mayo <- mayo_gaa %>% filter(Venue == "CROKE PARK")
kerry <- kerry_gaa %>% filter(Venue == "CROKE PARK")
# Bind these together into a new dataset
new <- rbind(dublin,cork,kerry,mayo)
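The four team-specific steps above apply the same rule with a different team name each time. As a minimal sketch only, that rule could be wrapped in one helper. The function name `label_winner` and the toy data frame are hypothetical and invented for illustration; the only assumption about the real data is that Team_Score belongs to the team in Team_Name and Opp_Score to the team in Opp_Name, as described earlier. The sketch uses base R so that it runs without extra packages.

```r
# Hypothetical helper: label the winner of each match from one team's perspective.
label_winner <- function(matches, team) {
  # Keep only matches involving `team`, on either side of the fixture.
  involved <- matches[matches$Team_Name == team | matches$Opp_Name == team, ]
  # Work out the score of `team` and of its opponent, whichever column holds them.
  team_is_first <- involved$Team_Name == team
  own   <- ifelse(team_is_first, involved$Team_Score, involved$Opp_Score)
  other <- ifelse(team_is_first, involved$Opp_Score, involved$Team_Score)
  # The team wins when its own score is higher; equal scores are a tie.
  involved$Winner <- ifelse(own > other, team,
                            ifelse(own < other, "OTHER TEAM", "TIE"))
  involved
}

# Toy data, invented purely for illustration (not taken from the GAA data set).
toy <- data.frame(
  Team_Name  = c("DUBLIN", "KERRY", "MAYO", "CORK"),
  Opp_Name   = c("KERRY", "DUBLIN", "DUBLIN", "MAYO"),
  Team_Score = c(18, 15, 12, 10),
  Opp_Score  = c(14, 15, 16, 9),
  stringsAsFactors = FALSE
)

dub_toy <- label_winner(toy, "DUBLIN")
print(dub_toy[, c("Team_Name", "Opp_Name", "Winner")])
```

Comparing the team's own score with its opponent's score in one place avoids having to write separate conditions for the Team_Name and Opp_Name cases, which is where the direction of the comparison is easy to get wrong.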
This next plot shows the four teams that have played most often in Croke Park and their performance there. Dublin clearly plays in Croke Park far more frequently than the others, which raises the question of whether this in itself gives them an advantage. Dublin’s win-to-loss ratio is also much higher than that of the other three teams.
This next plot shows Dublin’s performance in its most played stadiums. Again, Dublin has played in Croke Park very frequently, and its win-to-loss ratio is much higher in Croke Park than in other stadiums.
These next plots repeat the same information as pie charts, showing the proportion of wins instead of the number. On this measure Dublin ranks second, with Cork having a higher percentage of wins in Croke Park. The next plot highlights that Dublin has a higher percentage of wins in Croke Park than in any other stadium.
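The difference between the number of wins and the proportion of wins matters here because the teams have played very different numbers of games in Croke Park. The following is a minimal sketch of how the two measures can be computed side by side. The data frame is toy data invented for illustration, and the `Team` column (recording whose perspective each row is taken from) is a hypothetical addition that is not one of the 13 original variables.

```r
# Toy results, invented for illustration; not taken from the GAA data set.
# `Team` is a hypothetical column recording which team each row belongs to.
croke <- data.frame(
  Team   = c("DUBLIN", "DUBLIN", "DUBLIN", "DUBLIN",
             "CORK", "CORK", "KERRY", "KERRY"),
  Winner = c("DUBLIN", "DUBLIN", "DUBLIN", "TIE",
             "CORK", "OTHER TEAM", "OTHER TEAM", "KERRY"),
  stringsAsFactors = FALSE
)

# A row counts as a win when the Winner label matches the team itself.
won <- croke$Winner == croke$Team

games    <- tapply(won, croke$Team, length)  # games played per team
wins     <- tapply(won, croke$Team, sum)     # number of wins per team
win_prop <- tapply(won, croke$Team, mean)    # proportion of games won

print(data.frame(games = games, wins = wins, win_prop = round(win_prop, 2)))
```

A team with many games can lead on the number of wins while another team with fewer games leads on the proportion, so the two measures should be read together.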
This interactive plot shows the winning team’s score in each match won in Croke Park by the four teams, together with Dublin’s scores in their most played venues. It shows that Dublin’s highest scores have come in Croke Park, both relative to the other teams who have played there and relative to the other venues Dublin have played in. This again suggests that Dublin plays better in Croke Park than in other venues.
I, Con Anthony Mc Cord, had primary responsibility for the material in XXXXX.
I, Jade Eve Sweeney, had primary responsibility for the material in XXXXX.
I, Mikaela Jaya Sminia, had primary responsibility for the material in XXXXX.
Although each team member had a section for which they were primarily responsible, the input of every team member was invaluable to each of us in completing our individual sections. This report required the skills of each team member in its completion.